Feature Level Compensation for Robust Speaker Identification in Mismatched Conditions

Authors

  • Sharada V Chougule
  • Mahesh S Chavan
Abstract

This paper proposes robust front-end features that improve speaker identification (SI) performance under real-world conditions, in particular mismatch between training and testing conditions. The most commonly used features, MFCCs, are highly sensitive to channel and environment mismatch. Speech characteristics change with room acoustics, channel, and microphone, as well as with background noise, and these changes degrade the performance of the SI system. To make the front-end features robust, an asymmetric Hamming-cosine taper is used, which gives better spectral estimation and reduces interfering band-limited noise. To incorporate time-varying information, second-order derivatives of the cepstral coefficients are concatenated to the MFCC features. Convolutional errors are minimized using cepstral mean normalization (CMN), and additive noise is compensated by magnitude spectral subtraction (MSS). The performance of a closed-set, text-independent speaker identification system is evaluated under different train and test conditions, namely sensor (microphone) mismatch, speaking-style mismatch, language mismatch, and environment mismatch, using the IIT-G multi-variability speech database developed for speaker recognition. The experimental results show that the modified front-end features outperform conventional baseline MFCC features in mismatched conditions, especially for sensor and environment mismatch. Speaking-style and language mismatch are observed to have less effect on identification accuracy.

Keywords: Text Independent Speaker Identification, MFCC, Cepstral Mean Normalization, Magnitude Spectral Subtraction
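The two compensation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `floor` parameter, and the idea of taking the noise estimate from non-speech frames are assumptions made here for illustration. CMN relies on the fact that a fixed channel appears as a constant additive offset in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes it. MSS subtracts an estimated noise magnitude from each spectral frame and clamps the result to a small positive floor.

```python
def cepstral_mean_normalization(frames):
    """Subtract the per-coefficient mean, taken over all frames of one utterance.

    frames: list of frames, each a list of cepstral coefficients.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


def magnitude_spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate from every frame, bin by bin.

    noisy_mag: list of magnitude-spectrum frames.
    noise_mag: one noise magnitude per frequency bin (assumed here to be an
               average over non-speech frames; the estimator is hypothetical).
    floor:     hypothetical spectral floor, as a fraction of the noisy
               magnitude, that keeps the result from going negative.
    """
    return [
        [max(m - n, floor * m) for m, n in zip(frame, noise_mag)]
        for frame in noisy_mag
    ]


if __name__ == "__main__":
    clean = [[1.0, 2.0], [3.0, 6.0]]
    channel = [0.5, -1.0]  # constant cepstral offset standing in for a channel
    shifted = [[c + h for c, h in zip(frame, channel)] for frame in clean]
    # Both calls print the same normalized frames: the offset is removed.
    print(cepstral_mean_normalization(clean))
    print(cepstral_mean_normalization(shifted))
    print(magnitude_spectral_subtraction([[1.0, 0.5]], [0.25, 0.75]))
```

In this toy case the clean and channel-shifted utterances normalize to the same frames, which is the property that makes CMN useful against sensor mismatch. The floor in the subtraction step is one common way to avoid negative magnitudes; the abstract does not say which variant the paper uses.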


Similar resources

Robust Speaker Recognition

The automatic speaker recognition technologies have developed into more and more important modern technologies required by many speech-aided applications. The main challenge for automatic speaker recognition is to deal with the variability of the environments and channels from where the speech was obtained. In previous work, good results have been achieved for clean high-quality speech with mat...


Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions

It is well known that MFCC based speaker identification (SID) systems easily break down under mismatched train and test conditions. In this study, we report on evaluation of four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise including babble, car, factory, and white at five ...


Speaker Recognition with Mismatched Coded Speech

This paper investigates the effects of low-bit-rate coded speech on the performance of a fixed-text speaker recognition system, under mismatched coding conditions between enrollment and testing. Significant degradation of performance has been observed relative to matched conditions, where the same coding is used. Two techniques have been proposed to overcome mismatch effects; a linear discriminative...


A model-based transformational approach to robust speaker recognition

A novel statistical modeling and compensation method for robust speaker recognition is presented. The method specifically addresses the degradation in speaker verification performance due to the mismatch in channels (e.g., telephone handsets) between enrollment and testing sessions. In mismatched conditions, the new approach uses speaker-independent channel transformations to synthesize a spea...


Speaker Identification Using Ensembles of Feature Enhancement Methods

In this paper, we propose a classifier ensemble of various channel compensation and feature enhancement methods for robust speaker identification on various environments. The proposed ensemble system is constructed with 15 classifiers including three channel compensation methods (including CMS and variance normalization, and without compensation) and five feature enhancement methods (including ...




Publication date: 2014